ARTEMIS-UBIMEDIA at TRECVid 2012: Instance Search Task
Abstract
This paper describes the approach proposed by the ARTEMIS-UBIMEDIA team for the TRECVID 2012 Instance Search (INS) task [1]. The method is based on a Bag-of-Words (BoW) representation obtained by uniformly sampling the frames of the videos. We propose a query expansion technique that uses the textual description of each query to find new instances of the query object on Flickr, so that the query descriptor is enriched with additional representative instances.
1 Structured Abstract
Briefly, what approach or combination of approaches did you test in each of your submitted runs? (please use the run id from the overall results table NIST returns)
all runs: sampling at 1 frame per second from the videos, frames resized to a 384x288 surface, Hessian-Affine detector and RootSIFT descriptor.
F_X_NO_UbiBWVTR_1: BoW vectors generated at shot level. A single query BoW vector is generated from the multiple example images.
F_X_NO_UbiBWVHF_2: BoW vectors generated at shot level. Query BoW vectors are generated from images fetched from Flickr using the provided textual description of the query.
F_X_NO_UbiBWFFM_3: BoW vectors generated for each frame and for each query image. For each query, a shot receives the score of its best-scoring frame. After querying, the top 500 results are re-ranked with a color consistency check, using the MPEG-7 Color Structure descriptor for whole-image queries and a region-based object detection method for partial-image/object queries.
F_X_NO_UbiBWFFR_4: BoW vectors generated for each frame and for each query image. For each query, a shot receives the score of its best-scoring frame. For multiple queries of the same topic, the best score of the video clip among the different query runs is selected.
What if any significant differences (in terms of what measures) did you find among the runs?
Overall, grouping the video frames into a single descriptor per video clip yielded the best results while significantly reducing the number of BoW vectors to be compared. The Flickr-based expanded queries provided satisfactory results by combining images crawled from the internet with the query images (without using the binary mask).
Based on the results, can you estimate the relative contribution of each component of your system/approach to its effectiveness?
The large size of the vocabulary compensated for the reduced number of interest points detected in the resized video frames. The images collected from Flickr improved the results for a number of topics whose query images were less representative or whose query entities were small.
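The difference between the shot-level runs and the best-frame runs can be illustrated with a minimal Python sketch. It is not the team's implementation: the abstract does not state the similarity measure or the weighting scheme, so L2-normalised word histograms compared by cosine similarity are assumed here, and all function names are hypothetical. The `root_sift` helper follows the usual RootSIFT definition (L1 normalisation followed by an element-wise square root).

```python
import math
from collections import Counter


def root_sift(descriptor):
    """RootSIFT: L1-normalise a non-negative descriptor, then take the square root."""
    total = sum(descriptor)
    if total == 0:
        return [0.0] * len(descriptor)
    return [math.sqrt(v / total) for v in descriptor]


def bow_vector(word_ids, vocab_size):
    """Histogram of visual word ids, L2-normalised (normalisation is an assumption)."""
    counts = Counter(word_ids)
    vec = [float(counts.get(i, 0)) for i in range(vocab_size)]
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def cosine(a, b):
    """Dot product of two L2-normalised vectors (assumed similarity measure)."""
    return sum(x * y for x, y in zip(a, b))


def shot_level_score(query_words, frames_words, vocab_size):
    """Pool the visual words of all frames of a shot into one BoW vector."""
    pooled = [w for frame in frames_words for w in frame]
    return cosine(bow_vector(query_words, vocab_size),
                  bow_vector(pooled, vocab_size))


def best_frame_score(query_words, frames_words, vocab_size):
    """Score every frame separately and keep the best score for the shot."""
    q = bow_vector(query_words, vocab_size)
    return max(cosine(q, bow_vector(f, vocab_size)) for f in frames_words)


def topic_score(query_images_words, frames_words, vocab_size):
    """Best shot score over the several query images of one topic."""
    return max(best_frame_score(q, frames_words, vocab_size)
               for q in query_images_words)


# Toy shot with two frames over a 4-word vocabulary.
frames = [[0, 1], [2, 2]]
pooled_score = shot_level_score([2, 2], frames, 4)
frame_score = best_frame_score([2, 2], frames, 4)
```

In this sketch the shot-level variant performs one comparison per shot, whereas the best-frame variant performs one comparison per sampled frame, which is the reduction in compared BoW vectors that the abstract mentions.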
Related resources
ARTEMIS-UBIMEDIA at TRECVid 2011: Instance Search
This paper describes the approach proposed by the ARTEMIS-UBIMEDIA team at TRECVID 2011, Instance Search (INS) task. The method is based on a semi-global image representation relying on an over-segmentation of the keyframes. An aggregation mechanism is then applied to group a set of sub-regions into an object similar to the query, under a global similarity criterion.
The University of Sheffield and Harbin Engineering University at TRECVID 2012 : Instance Search
This paper describes our contribution to the instance search (INS) task for TRECVID 2012. We present four approaches for this task: (i) histograms of SIFT features as feature vectors and Bhattacharyya distance for similarity detection; (ii) a feature vector that is a combination of SIFT features alone, matched with a basic descriptor matching algorithm; (iii) an IR based approach using SIFT features ...
PKU-ICST at TRECVID 2012: Instance Search Task
We participate in both types of the instance search task in TRECVID 2012: automatic search and interactive search. This paper presents our approaches and results. In this task, we mainly focus on exploring effective feature representation, feature matching, re-ranking algorithms and query expansion. In feature representation, we adopt two basic visual features and five keypoint-based BoW feat...
NTT Communication Science Laboratories at TRECVID 2014 Instance Search Task
This paper reports our method and experimental results on the TRECVID 2014 [1] instance search task. Since 2012, we have been applying BM25 (Best Match 25), a state-of-the-art probabilistic information retrieval method from the field of text retrieval, to the instance search task. The standard BM25 uses the well-known Inverse Document Frequency (IDF) as the key-point discriminative power, a...
TRECVid 2012 Experiments at Dublin City University
Following previous participations in TRECVid, this year, the DCU-IAD team participated in four tasks of TRECVid 2012: Instance Search (INS), Interactive Known-Item Search (KIS), Multimedia Event Detection (MED) and Multimedia Event Recounting (MER).